46 results found.
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Slovenian
Availability:
Freely Available
License:
CreativeCommons
Size:
54 hours
Production Status:
Newly created-finished
Use:
Speech Recognition/Understanding
- Paper title: The SI TEDx-UM speech database: a new Slovenian Spoken Language Resource
- Paper track: Speech
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Andrej Zgank | University of Maribor | SI |
| Author 2 | Mirjam Sepesy Maucec | University of Maribor | SI |
| Author 3 | Darinka Verdonik | University of Maribor | SI |
| Main Contact | Andrej Zgank | University of Maribor | None |
Documentation:
<Not Specified>
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Slovenian
Availability:
From Data Center(s)
License:
<Not Specified>
Size:
36 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: The SI TEDx-UM speech database: a new Slovenian Spoken Language Resource
- Paper track: Speech
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Andrej Zgank | University of Maribor | SI |
| Author 2 | Mirjam Sepesy Maucec | University of Maribor | SI |
| Author 3 | Darinka Verdonik | University of Maribor | SI |
| Main Contact | Andrej Zgank | University of Maribor | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Slovenian
Availability:
From Data Center(s)
License:
ELRA
Size:
36 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: The Slovene BNSI Broadcast News database and reference speech corpus GOS: Towards the uniform guidelines for future work
- Paper track: Speech
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Andrej Zgank | University of Maribor | SI |
| Author 2 | Ana Zwitter Vitez | Trojina, Institute for Applied Slovene Studies | SI |
| Author 3 | Darinka Verdonik | University of Maribor | SI |
| Main Contact | Andrej Zgank | University of Maribor | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
English Slovenian
Availability:
Freely Available
License:
<Not Specified>
Size:
27924210 words
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Producing Monolingual and Parallel Web Corpora at the Same Time - SpiderLing and Bitextor's Love Affair
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 |
|---|---|---|---|---|---|
| Author 1 | Nikola Ljubešić | University of Zagreb | HR | | |
| Author 2 | Miquel Esplà-Gomis | Universitat d'Alacant | ES | | |
| Author 3 | Antonio Toral | Dublin City University | IE | | |
| Author 4 | Sergio Ortiz Rojas | <Not Specified> | None | | |
| Author 5 | Filip Klubička | University of Zagreb | HR | | |
| Main Contact | Nikola Ljubešić | Jožef Stefan Institute | None | University of Zagreb | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Slovenian
Availability:
Freely Available
License:
<Not Specified>
Size:
42919 <Not Specified>
Production Status:
Existing-used
Use:
Word Sense Disambiguation
- Paper title: Cleaning noisy wordnets
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Benoît Sagot | Inria | None |
| Author 2 | Darja Fišer | University of Ljubljana | None |
| Main Contact | Benoît Sagot | Inria | FR |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Trilingual
Languages:
Croatian Serbian Slovenian
Availability:
From Owner
License:
<Not Specified>
Size:
<Not Specified> <Not Specified>
Production Status:
Existing-used
Use:
Evaluation/Validation
- Paper title: Diacritics Restoration Using Neural Networks
- Paper track: Evaluation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 |
|---|---|---|---|---|---|
| Author 1 | Jakub Náplava | Charles University, Institute of Formal and Applied Linguistics | CZ | | |
| Author 2 | Milan Straka | Charles University | None | | |
| Author 3 | Pavel Straňák | Charles University in Prague | CZ | | |
| Author 4 | Jan Hajic | Charles University in Prague | CZ | Charles University | CZ |
| Main Contact | Jakub Náplava | Charles University, Institute of Formal and Applied Linguistics | None | | |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Slovenian
Availability:
Freely Available
License:
CC-BY-SA 3.0
Size:
38465311
Production Status:
Newly created-finished
Use:
<Not Specified>
- Paper title: TweetCaT: a tool for building Twitter corpora of smaller languages
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Nikola Ljubešić | University of Zagreb | SI |
| Author 2 | Darja Fišer | University of Ljubljana | SI |
| Author 3 | Tomaž Erjavec | Dept. of Knowledge Technologies, Jožef Stefan Institute | SI |
| Main Contact | Nikola Ljubešić | Jožef Stefan Institute | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Croatian Slovenian
Availability:
Freely Available
License:
<Not Specified>
Size:
4222
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Quality Estimation for Synthetic Parallel Data Generation
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Raphael Rubino | Prompsit Language Engineering | DE |
| Author 2 | Antonio Toral | Dublin City University | NL |
| Author 3 | Nikola Ljubešić | University of Zagreb | SI |
| Author 4 | Gema Ramírez-Sánchez | Prompsit Language Engineering | ES |
| Main Contact | Raphael Rubino | DFKI | None |
Documentation:
<Not Specified>
Language Type:
Trilingual
Languages:
Croatian Serbian Slovenian
Availability:
Freely Available
License:
CC-BY-SA-NC
Size:
30 GByte
Production Status:
Newly created-in progress
Use:
Corpus Creation/Annotation
- Paper title: Corpus-Based Diacritic Restoration for South Slavic Languages
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 |
|---|---|---|---|---|---|
| Author 1 | Nikola Ljubešić | University of Zagreb | HR | | |
| Author 2 | Tomaž Erjavec | Dept. of Knowledge Technologies, Jožef Stefan Institute | SI | | |
| Author 3 | Darja Fišer | University of Ljubljana | SI | | |
| Main Contact | Nikola Ljubešić | Jožef Stefan Institute | None | University of Zagreb | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Croatian Slovenian
Availability:
Freely Available
License:
<Not Specified>
Size:
573
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Quality Estimation for Synthetic Parallel Data Generation
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Raphael Rubino | Prompsit Language Engineering | DE |
| Author 2 | Antonio Toral | Dublin City University | NL |
| Author 3 | Nikola Ljubešić | University of Zagreb | SI |
| Author 4 | Gema Ramírez-Sánchez | Prompsit Language Engineering | ES |
| Main Contact | Raphael Rubino | DFKI | None |
Documentation:
Available documentation in English




